Searching for test data with feature diversity
There is an implicit assumption in software testing that more diverse and
varied test data is needed for effective testing and to achieve different types
and levels of coverage. Generic approaches based on information theory to
measure and thus, implicitly, to create diverse data have also been proposed.
However, if the tester is able to identify features of the test data that are
important for the particular domain or context in which the testing is being
performed, the use of generic diversity measures such as these may be neither
sufficient nor efficient for creating test inputs that show diversity in terms
of these features. Here we investigate different approaches to find data that
are diverse according to a specific set of features, such as length, depth of
recursion, etc. Even though these features will be less general than measures
based on information theory, their use may provide a tester with more direct
control over the type of diversity that is present in the test data. Our
experiments are carried out in the context of a general test data generation
framework that can generate both numerical and highly structured data. We
compare random sampling for feature-diversity to different approaches based on
search and find a hill climbing search to be efficient. The experiments
highlight many trade-offs that need to be taken into account when searching
for diversity. We argue that recurrent test data generation motivates building
statistical models that can then help to more quickly achieve feature
diversity.
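The hill-climbing approach that the abstract reports as efficient can be sketched in a few lines. This is a minimal illustration, not the paper's actual implementation: here test inputs are assumed to be nested lists, the features are length and nesting depth (standing in for the length/recursion-depth examples above), and diversity is taken to be the mean pairwise Euclidean distance in feature space.

```python
import random

def features(x):
    """Feature vector for a test input: (length, max nesting depth)."""
    def depth(v):
        return 1 + max((depth(e) for e in v), default=0) if isinstance(v, list) else 0
    return (len(x), depth(x))

def diversity(inputs):
    """Mean pairwise Euclidean distance between feature vectors."""
    fs = [features(x) for x in inputs]
    pairs = [(a, b) for i, a in enumerate(fs) for b in fs[i + 1:]]
    return sum(((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
               for a, b in pairs) / len(pairs)

def mutate(x):
    """Randomly grow, shrink, or nest a list-valued test input."""
    x = list(x)
    op = random.choice(("add", "drop", "nest"))
    if op == "add" or not x:
        x.append(random.randint(0, 9))
    elif op == "drop":
        x.pop(random.randrange(len(x)))
    else:
        i = random.randrange(len(x))
        x[i] = [x[i]]
    return x

def hill_climb(pool, steps=500):
    """Mutate one input at a time; keep the mutant only if set diversity improves."""
    best = diversity(pool)
    for _ in range(steps):
        i = random.randrange(len(pool))
        candidate = pool[:i] + [mutate(pool[i])] + pool[i + 1:]
        d = diversity(candidate)
        if d > best:
            pool, best = candidate, d
    return pool, best

random.seed(0)
pool, score = hill_climb([[0] for _ in range(5)])
```

Starting from five identical inputs (feature diversity zero), the climb quickly spreads the pool across the feature space; the trade-off is that every candidate evaluation recomputes pairwise distances over the whole set.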
The Use of Automated Search in Deriving Software Testing Strategies
Testing a software artefact using every one of its possible inputs would normally cost too much, and take too long, compared to the benefits of detecting faults in the software. Instead, a testing strategy is used to select a small subset of the inputs with which to test the software. The criterion used to select this subset affects the likelihood that faults in the software will be detected. For some testing strategies, the criterion may result in subsets that are very efficient at detecting faults, but implementing the strategy -- deriving a 'concrete strategy' specific to the software artefact -- is so difficult that it is not cost-effective to use that strategy in practice.
In this thesis, we propose the use of metaheuristic search to derive concrete testing strategies in a cost-effective manner.
We demonstrate a search-based algorithm that derives concrete strategies for 'statistical testing', a testing strategy that has a good fault-detecting ability in theory, but which is costly to implement in practice. The cost-effectiveness of the search-based approach is enhanced by the rigorous empirical determination of an efficient algorithm configuration and associated parameter settings, and by the exploitation of low-cost commodity GPU cards to reduce the time taken by the algorithm. The use of a flexible grammar-based representation for the test inputs ensures the applicability of the algorithm to a wide range of software
Test Set Diameter: Quantifying the Diversity of Sets of Test Cases
A common and natural intuition among software testers is that test cases need
to differ if a software system is to be tested properly and its quality
ensured. Consequently, much research has gone into formulating distance
measures for how test cases, their inputs and/or their outputs differ. However,
common to these proposals is that they are data type specific and/or calculate
the diversity only between pairs of test inputs, traces or outputs.
We propose a new metric to measure the diversity of sets of tests: the test
set diameter (TSDm). It extends our earlier, pairwise test diversity metrics
based on recent advances in information theory regarding the calculation of the
normalized compression distance (NCD) for multisets. An advantage is that TSDm
can be applied regardless of data type and on any test-related information, not
only the test inputs. A downside is the increased computational time compared
to competing approaches.
Our experiments on four different systems show that the test set diameter can
help select test sets with higher structural and fault coverage than random
selection even when only applied to test inputs. This can enable early test
design and selection, prior to even having a software system to test, and
complement other types of test automation and analysis. We argue that this
quantification of test set diversity creates a number of opportunities to
better understand software quality and provides practical ways to increase it.
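The multiset NCD underlying TSDm can be approximated with an off-the-shelf compressor. The sketch below is a simplified stand-in, assuming zlib as the compressor and the multiset formulation NCD(X) = (C(X) - min C(x)) / max C(X without x); the paper's implementation details may differ.

```python
import zlib

def C(data: bytes) -> int:
    """Approximate the Kolmogorov complexity of data by its compressed size."""
    return len(zlib.compress(data, 9))

def tsd(items):
    """Multiset NCD, a zlib-based stand-in for the test set diameter:
    (C of whole set - C of cheapest member) / (max C over sets missing one member)."""
    xs = [x.encode() for x in items]
    whole = C(b"".join(xs))
    cheapest = min(C(x) for x in xs)
    denom = max(C(b"".join(xs[:i] + xs[i + 1:])) for i in range(len(xs)))
    return (whole - cheapest) / denom

similar = ["abcabcabc"] * 4
varied = ["abcabcabc", "xyzxyzxyz", "123123123", "!!!???!!!"]
```

A more varied set yields a larger diameter, which is the signal used to prefer one candidate test set over another; the cost is one compression call per member per evaluation.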
The Optimisation of Stochastic Grammars to Enable Cost-Effective Probabilistic Structural Testing
The effectiveness of probabilistic structural testing depends on the characteristics of the probability distribution from which test inputs are sampled at random. Metaheuristic search has been shown to be a practical method of optimising the characteristics of such distributions. However, the applicability of the existing search-based algorithm is limited by the requirement that the software's inputs must be a fixed number of numeric values. In this paper we relax this limitation by means of a new representation for the probability distribution. The representation is based on stochastic context-free grammars but incorporates two novel extensions: conditional production weights and the aggregation of terminal symbols representing numeric values. We demonstrate that an algorithm which combines the new representation with hill-climbing search is able to efficiently derive probability distributions suitable for testing software with structurally-complex input domains.
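A plain stochastic context-free grammar, without the paper's conditional weights or numeric-terminal aggregation, can be sampled as below. The toy expression grammar and its weights are illustrative assumptions; in the paper's setting, search would tune such weights to shape the resulting input distribution.

```python
import random

# Toy stochastic context-free grammar: nonterminal -> weighted productions.
# The production weights are the knobs a search algorithm would tune.
GRAMMAR = {
    "expr": [(0.4, ["num"]),
             (0.3, ["expr", "+", "expr"]),
             (0.3, ["(", "expr", ")"])],
    "num":  [(1.0, ["<int>"])],
}

def sample(symbol="expr", depth=0, max_depth=8):
    """Expand a symbol by choosing productions according to their weights,
    forcing the shortest production once max_depth is reached."""
    if symbol not in GRAMMAR:
        return str(random.randint(0, 99)) if symbol == "<int>" else symbol
    prods = GRAMMAR[symbol]
    if depth >= max_depth:
        production = min(prods, key=lambda p: len(p[1]))[1]
    else:
        production = random.choices([p for _, p in prods],
                                    weights=[w for w, _ in prods])[0]
    return "".join(sample(s, depth + 1, max_depth) for s in production)

random.seed(1)
inputs = [sample() for _ in range(5)]
```

Raising the weight of the recursive productions yields deeper, longer expressions; lowering it concentrates sampling on short numeric inputs.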
Transferring Interactive Search-Based Software Testing to Industry
Search-Based Software Testing (SBST) is the application of optimization
algorithms to problems in software testing. In previous work, we have
implemented and evaluated Interactive Search-Based Software Testing (ISBST)
tool prototypes, with the goal of successfully transferring the technique to
industry. While SBSE solutions are often validated on benchmark problems, there
is a need to validate them in an operational setting. The present paper
discusses the development and deployment of SBST tools for use in industry and
reflects on the transfer of these techniques to industry. In addition to
previous work discussing the development and validation of an ISBST prototype,
a new version of the prototype ISBST system was evaluated in the laboratory and
in industry. This evaluation is based on an industrial System under Test (SUT)
and was carried out with industrial practitioners. The Technology Transfer
Model is used as a framework to describe the progression of the development and
evaluation of the ISBST system. The paper synthesises previous work on
developing and evaluating the ISBST prototype, and presents an evaluation, in
both academia and industry, of the prototype's latest version.
This paper presents an overview of the development and deployment of the ISBST
system in an industrial setting, using the framework of the Technology Transfer
Model. We conclude that the ISBST system is capable of evolving useful test
cases for that setting, though improvements are still needed in how the system
communicates that information to the user. In addition, a set
of lessons learned from the project are listed and discussed. Our objective is
to help other researchers who wish to validate search-based systems in
industry and provide more information about the benefits and drawbacks of these
systems.
Full Implementation of an Estimation of Distribution Algorithm on a GPU
We submit an implementation of an Estimation of Distribution Algorithm – specifically a variant of the Bayesian Optimisation Algorithm (BOA) – using GPGPU. Every aspect of the algorithm is executed on the device, and it makes effective use of multiple GPU devices in a single machine. As for other EDAs, our implementation is generic in that it may be applied to any problem for which solutions may be represented as binary strings. For the purpose of this paper, we apply it to a particular problem known to be difficult for metaheuristic algorithms due to high interdependency between variables: finding the lowest energy state of an Ising Spin Glass. We show that our GPU implementation demonstrates a speedup in excess of 80x compared with an equivalent CPU implementation. To our knowledge, this is the first EDA to be implemented fully on the GPU.
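The sample/select/re-estimate loop common to all EDAs can be sketched with a univariate model. The code below is UMDA on the OneMax problem, a deliberate simplification: the paper's BOA instead learns a Bayesian network over the bits to capture their interdependencies, and runs the whole loop on the GPU. All parameter values here are illustrative.

```python
import random

def umda_onemax(n_bits=20, pop=100, elite=25, gens=60, seed=0):
    """Univariate EDA (UMDA) maximising OneMax: sample a population from
    per-bit marginals, keep the fittest, re-estimate the marginals.
    BOA replaces the independent marginals with a learned Bayesian network."""
    rng = random.Random(seed)
    p = [0.5] * n_bits                                   # P(bit i == 1)
    for _ in range(gens):
        population = [[1 if rng.random() < pi else 0 for pi in p]
                      for _ in range(pop)]
        population.sort(key=sum, reverse=True)
        fittest = population[:elite]                     # truncation selection
        p = [sum(ind[i] for ind in fittest) / elite for i in range(n_bits)]
        p = [min(max(pi, 0.02), 0.98) for pi in p]       # keep some exploration
    return max(population, key=sum)

best = umda_onemax()
```

The sampling and selection steps are embarrassingly parallel across individuals, which is what makes this family of algorithms a natural fit for GPU execution.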
Generating Structured Test Data with Specific Properties using Nested Monte-Carlo Search
Software acting on complex data structures can be challenging to test: it is
difficult to generate diverse test data that satisfies structural constraints
while simultaneously exhibiting properties, such as a particular size, that the
test engineer believes will be effective in detecting faults. In our previous
work we introduced GödelTest, a framework for generating such data structures
using non-deterministic programs, and combined it with Differential Evolution
to optimize the generation process.
Monte-Carlo Tree Search (MCTS) is a search technique that has shown great
success in playing games that can be represented as a sequence of decisions. In
this paper we apply Nested Monte-Carlo Search, a single-player variant of MCTS,
to the sequence of decisions made by the generating programs used by GödelTest,
and show that this combination can efficiently generate random data structures
which exhibit the specific properties that the test engineer requires. We
compare the results to Boltzmann sampling, an analytical approach to generating
random combinatorial data structures.
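Nested Monte-Carlo Search itself is compact. The sketch below is a hypothetical single-player setup, not GödelTest's generator: decisions build up a structure, and the score rewards hitting a target (size, depth), mirroring the specific properties a test engineer might require.

```python
import random

MOVES = ("add", "nest", "stop")
TARGET = (6, 3)          # desired (size, depth) of the generated structure

def terminal(state):
    return (len(state) > 0 and state[-1] == "stop") or len(state) >= 15

def build(state):
    """Interpret a decision sequence as a structure's (size, depth)."""
    size = depth = 0
    for move in state:
        if move == "add":
            size += 1
        elif move == "nest":
            depth += 1
        else:
            break
    return size, depth

def score(state):
    size, depth = build(state)
    return -(abs(size - TARGET[0]) + abs(depth - TARGET[1]))

def playout(state, rng):
    """Level-0 search: complete the sequence with uniformly random moves."""
    while not terminal(state):
        state = state + (rng.choice(MOVES),)
    return state

def nmcs(state, level, rng):
    """Nested Monte-Carlo Search: evaluate each move with a level-(n-1)
    search, then advance along the best complete sequence found so far."""
    if level == 0 or terminal(state):
        return playout(state, rng)
    best, best_score = None, float("-inf")
    while not terminal(state):
        for move in MOVES:
            result = nmcs(state + (move,), level - 1, rng)
            if score(result) > best_score:
                best, best_score = result, score(result)
        state = state + (best[len(state)],)   # follow the best sequence
    return best

result = nmcs((), 2, random.Random(3))
```

A level-2 search reliably steers the decision sequence close to the target property, at the cost of a number of playouts that grows quickly with the nesting level.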
Adding Contextual Guidance to the Automated Search for Probabilistic Test Profiles
Statistical testing is a probabilistic approach to test data generation that has been demonstrated to be very effective at revealing faults. Its premise is to compensate for the imperfect connection between coverage criteria and the faults to be revealed by exercising each coverage element several times with different random data. The cornerstone of the approach is the often complex task of determining a suitable input profile, and recent work has shown that automated metaheuristic search can be a practical method of synthesising such profiles. The starting point of this paper is the hypothesis that, for some software, the existing grammar-based representation used by the search algorithm fails to capture important relationships between input arguments, and that this can limit the fault-revealing power of the synthesised profiles. We provide evidence in support of this hypothesis, and propose a solution in which the user provides some basic contextual knowledge to guide the search. Empirical results for two case studies are promising: knowledge gained by a very straightforward review of the software-under-test is sufficient to dramatically increase the efficacy of the profiles synthesised by search.
Efficient software verification: Statistical testing using automated search
Statistical testing has been shown to be more efficient at detecting faults in software than other methods of dynamic testing such as random and structural testing. Test data are generated by sampling from a probability distribution chosen so that each element of the software's structure is exercised with a high probability. However, deriving a suitable distribution is difficult for all but the simplest of programs. This paper demonstrates that automated search is a practical method of finding near-optimal probability distributions for real-world programs, and that test sets generated from these distributions continue to show superior efficiency in detecting faults in the software.
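The core idea, raising the lowest per-branch execution probability by reshaping the input distribution, can be illustrated on a toy program. Everything below is an illustrative assumption: the branch structure, the sampler names, and the hand-tuned skewed profile that stands in for a search-derived distribution.

```python
import random

def branches(x):
    """Record which branches of a toy program input x exercises."""
    hit = set()
    if x < 500:
        hit.add("b1")
    else:
        hit.add("b2")
    if x % 100 == 0:
        hit.add("b3")            # rare branch under uniform sampling
    return hit

def min_branch_prob(sampler, trials=2000):
    """Estimate the lowest per-branch execution probability, the quantity
    statistical testing maximises when deriving an input distribution."""
    rng = random.Random(0)
    counts = {"b1": 0, "b2": 0, "b3": 0}
    for _ in range(trials):
        for b in branches(sampler(rng)):
            counts[b] += 1
    return min(c / trials for c in counts.values())

def uniform(rng):
    return rng.randrange(1000)

def skewed(rng):
    """Hand-tuned profile standing in for a search-derived one: boost the
    chance of a multiple of 100, so the rare branch is exercised often."""
    return rng.randrange(10) * 100 if rng.random() < 0.3 else rng.randrange(1000)
```

Under the uniform sampler the rare branch is hit with probability about 0.01, while the skewed profile raises the minimum per-branch probability substantially; automated search generalises this hand tuning to real programs.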